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Hydrologic models use relatively simple mathematical equations to 
conceptualize and aggregate the complex, spatially distributed, and 
highly interrelated water, energy, and vegetation processes in a 
watershed. A consequence of process aggregation is that the model 
parameters often do not represent directly measurable entities and must, 
therefore, be estimated using measurements of the system inputs and 
outputs. During this process, known as model calibration, the 
parameters are adjusted so that the behavior of the model approximates, 
as closely and consistently as possible, the observed response of the 
hydrologic system over some historical period of time. This Chapter 
reviews the current state-of-the-art of model calibration in watershed 
hydrology with special emphasis on our own contributions in the last 
few decades. We discuss the historical background that has led to 
current perspectives, and review different approaches for manual and 
automatic single- and multi -objective parameter estimation. In 
particular, we highlight the recent developments in the calibration of 
distributed hydrologic models using parameter dimensionality 
reduction sampling, parameter regularization and parallel computing. 
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Finally, this chapter concludes with a short summary of methods for 
assessment of parameter uncertainty, including recent advances in 
Markov chain Monte Carlo sampling and sequential data assimilation 
methods based on the Ensemble Kalman Filter. 


1. Introduction 

Hydrologic models serve as important tools for improving our 
knowledge of watershed functioning (understanding), for providing 
critical information in support of sustainable management of water 
resources (decision making), and for prevention of water-related natural 
hazards such as flooding (forecasting/prediction). Hydrologic models 
consist of a general structure, which mathematically represents the 
coupling of dominant hydrologic processes perceived to control 

KXGURORJLF □ EHKDYLRUD RID 3 PDQ\D DVLPLODUD □ ZDWHUVKHGV'D □ 7UDGLWLRQI 
general model structure is then used for simulation and/or prediction of 
the KAGURORJLFD EHKDYLRUD RID DD 3 SDUWEF§h0pJd CbjZDWHUVKHG' 
HVWLPDWLQJD WKHD XQNQRZQD FRHIILFLHQWVD D NQRZQD DVD 3 SDUDPHWHUV'[ 
mathematical expressions embodied within. 

No matter how sophisticated and spatially explicit, all hydrologic 
models aggregate (at some level of detail) the complex, spatially 
distributed vegetation and subsurface properties into much simpler 
homogeneous storages with transfer functions that describe the flow of 
water within and between these different compartments. These 
conceptual storages correspond to physically identifiable control 
volumes in real space, even though the boundaries of these control 
volumes are generally not known. A consequence of this aggregation 
process is that most of the parameters in these models cannot be inferred 
through direct observation in the field, but can only be meaningfully 
derived by calibration against an input - output record of the watershed 
response. In this process, the parameters are adjusted in such a way that 
the model approximates, as closely and consistently as possible, the 
response of the watershed over some historical period of time. The 
parameters estimated in this manner are therefore effective conceptual 
representations of spatially and temporally heterogeneous properties of 
the watershed. Therefore, successful application of any hydrologic model 
depends critically on the chosen values of the parameters. 

In this chapter, we review the current state-of-the-art of model 
calibration in watershed hydrology. We discuss manual and automatic 
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parameter estimation techniques for calibration of lumped and spatially 
distributed hydrologic models. Specific methods include the widely used 
SCE-UA (Shuffled Complex Evolution - University of Arizona) and 
MOCOM-UA (Multi Objective COMplex evolution - University of 
Arizona) approaches for single- and multi-objective model calibration, 
step-wise (MACS: Multi-step Automatic Calibration Scheme) and 
sequential parameter estimation methods (DYNIA: DYNamic 

Identifiability Analysis; and PIMLI: Parameter Identification Method 
based on the Localization of Information), and emerging simultaneous 
multi-method evolutionary search methods (AMALGAM: A Multi- 
ALgorithm Genetically Adaptive Multiobjective). We highlight recent 
developments in the calibration of distributed hydrologic models 
containing spatially distributed parameter fields, using parameter 
dimensionality reduction sampling, parameter regularization and parallel 
computing. The chapter concludes with a short summary on 
methodologies for parameter uncertainty assessment, including Markov 
chain Monte Carlo sampling and sequential data assimilation using the 
Ensemble Kalman Filter (EnKF); here we discuss the RWM (Random 
Walk Metropolis), SCEM-UA (Shuffled Complex Evolution Metropolis- 
University of Arizona), DREAM (DiffeRential Evolution Adaptive 
Metropolis) and SODA (Simultaneous Optimization and Data 
Assimilation) sampling algorithms. Note that, although our discussion is 
limited to watershed models, the ideas and methods presented herein are 
applicable to a wide range of modeling and parameter estimation 
problems. 

2. Approaches to Parameter Estimation for Watershed Models 

There are two major appURDFKHVI WRI SDUDPHWHUffi ^FKWLPD WLRQ □ □ 2 3 

HVWLPDWLRQTB^GSbh - HV WLPD WLRQ ' □ YL&libPRiSHahl a 

priori estimation, values of the model parameters are specified without 

recourse to the observed dynamic hydrologic response (e.g. streamflow) 

of the watershed under study. Calibration, on the other hand, involves the 

selection of a parameter set that generates model responses that 

reproduce, as closely as possible, the historically observed hydrologic 

response of a particular watershed. Therefore, calibration can only be 

performed when long-term historical measurements of input-state-output 

behavior of the watershed (including streamflow, precipitation and 

potential evaporation) are available. 
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Parameter estimation strategies are intimately tied to the degree of 
hydrologic process representation embedded within the model. 
Hydrologic models can be classified accordingly as conceptual or 
physically-based. Most hydrologic models in use today are of the 
conceptual type, that conceptualize and aggregate the complex, spatially 
distributed, and highly interrelated water, energy, and vegetation 
processes in a watershed into relatively simple mathematical equations 
without exact treatment of the detailed underlying physics and basin- 
scale heterogeneity. Typical examples of conceptual type models are the 
SAC-SMA (SACramento Soil Moisture Accounting) model, 1 ’ 2 and the 
HBV (Hydrologiska Byrans Vattenbalansmodell) model 3 . Due to 
hydrologic process aggregation, the parameters in these models cannot 
generally be measured directly in the field at the desired scale of interest. 
Instead, when using conceptual type models, only the ranges of feasible 
parameter values can generally be specified a priori (perhaps with the 
combined knowledge of model structure and of dominant watershed 
processes). Calibration is then employed to select parameter estimates 
(from within the a priori defined ranges) that capture, as closely and 
consistently as possible, the historical record of the measured (target) 
hydrologic response of the watershed the model is intended to represent. 

Spatially distributed physically-based hydrologic models contain a 
series of partial differential equations describing physical principles 
related to conservation of mass, momentum (and energy). Typical 
examples of physically-based models are MIKE-SHE 4 (Systeme 
Hydrologique European) and KINEROS 5 (KINematic Runoff and 
EROSion). Their spatially distributed physically-based structure provides 
two potential strengths: 1) the ability to account for the spatial variability 
of runoff producing mechanisms and 2) the ability to infer model 
parameter values directly from spatio-temporal data by establishing 
physical or conceptual relationships between observable watershed 
characteristics (e.g., geology, topography, soils, land cover, etc.) and the 
parameters for the hydrologic processes represented in the model. 6 ’ 5,7,8 ’ 9 
The latter will be defined as the 3 ORFD0'|Morz parameter estimation 
approach and is particularly valuable for implementing hydrologic 
models in poorly gauged and ungauged watersheds where local response 
data is sparse or non-existent. In general, parameters estimated via the 
local a priori approach will still require some degree of fine-tuning via a 
calibration approach to obtain effective values that account for the 
influencing factors, such as heterogeneity, emergent processes, 
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differences in scale between model (larger scale) and the embedded 
hydrologic theory (developed at point/small scale). This refinement 
process ensures proper consistency between the model input-state-output 
behavior and the available response data. 10,11,12 

Another parameter estimation strategy, developed mainly for the 
implementation of conceptual type of models in ungauged basins, is 
FDOOHGD WKHI W^RjQE^rbach. 13,14,15,16,17 7KHD 3 UHJLRQDOL]HG' □ 
approach involves the development of regional regression relationships 
between the model parameter values estimated for a large number of 
gauged basins (via calibration) and observable watershed characteristics 
(i.e. landcover and soil properties) at those locations. The idea is that 
these relationships can be used to infer parameter estimates for 

3 K\GURORJLFDOO\ □ VLPLODU'IH XQJDXJHGD EDVLQVDI I JLYHQD NQRZOHGJHI I RID 

observable watershed characteristics. A major assumption of the regional 

a priori approach is that the calibrated model parameters are uniquely 

and clearly related to observable watershed properties. This assumption 

can be difficult to justify when many combinations of parameters are 

found to produce similar model responses due to parameter interaction, 18 

measurement uncertainty 19 and model structural uncertainty 20 , and can 

therefore result in ambiguous and biased relationships between the 

parameters and the watershed characteristics 17 . One way to improve the 

efficiency of the regionalized approach is to impose conditions (via 

watershed characteristics) on the calibrated parameters. 21 In an 

alternative approach, Yadav et al. 22 proposed regionalization of 

streamflow indices. In this approach, relationships between streamflow 

indices and physical watershed characteristics are established at the 

gauged locations. The regionalized flow indices, providing dynamic 

aspects of the ungauged watersheds, are then used to constrain 

hydrologic model predictions (and parameters). One advantage of this 

approach is that, the regionalized indices are independent of model 

structure and therefore can be used to constrain any watershed model. 

The paragraphs above provide a broad overview of approaches 
commonly used by the hydrologic community to specify values of the 
parameters in hydrologic PRGHOVI □ QDEH^ttbrTD DSSURDFKI I DQGl I 
3 FDOLEUD WLRQ ' □ □ 7KHI] VSHFLILF<thd]ftF«Wl 6RI(fc^IWMfi6fl lof 
hydrologic models, and in the following sections we therefore provide a 
more detailed overview of various calibration strategies that have been 
developed within the water resources context and have found widespread 
use in the hydrologic community. We discuss these methods within the 
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context of their historical development, including current and future 
perspectives. 


2.1. Model Calibration 

For a model to be useful in prediction, the values of the parameters need 
to accurately reflect the invariant properties of the components of the 
underlying system they represent. Unfortunately, in watershed 
hydrology, many of the parameters can generally not be measured 
directly, but can only be meaningfully derived through calibration 
against historical record of dynamic response (traditionally streamflow) 
data. Calibration is an iterative process in which the model parameters 
are adjusted so that the dynamic response of the model represents, as 
closely as possible, the observed target response (e.g. outlet streamflow) 
of the watershed. Figure 1 provides a schematic representation of the 
resulting model calibration problem. In this figure, ® represents 
observations of the forcing (rainfall) and streamflow response that are 
subject to measurement errors and uncertainty, and therefore may be 
different from the true values. Similarly, & represents the hydrologic 
model with functional response 
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Fig. 1. A schematic representation of the general model calibration problem The model 
parameters are iteratively adjusted so that the predictions of the model, 0 , (represented 
with the solid line) approximate as closely and consistently as possible the observed 
response (represented with dotted line). 

Mathematically, the model calibration problem depicted in Figure 1 
can be formulated as follows. Let S=<$(y/,X,0) denote the streamflow 
predictions } of the model & with observed forcing X (rainfall 

and potential evapotranspiration), state vector y, and model parameters 
6. Let 5'={5 l ,...^ n } represent a vector with n observed streamflow 

values. The difference between the model-predicted and measured 
streamflow can be represented by the residual vector E as: 

E(0) ={T{S)-T{S)y{T{s,)-T{s,\. . . ns„)~ T{s n )}={e,(0),. . . ?„(&)} (1) 

where T() allows for various monotonic (such as logarithmic) 
transformations of the model outputs. 

Traditionally, we seek values for the model parameters that result in a 
minimal discrepancy between the model predictions and observations. 

This can be done by minimizing an aggregate summary statistic of the 
residuals: 

/TO=/^(^(^...*(fl} (2) 

where the function F(6) allows for various user-selected linear and non- 
linear transformations of the residuals and is interchangeably called a 

3 FULWHULRQ ' □ □ 3 PHDVXUH' □ RU □ 3 REMHF WLYH □ IXQFWLRQ ' □ LQ □ WKH □ ZDWHUVKI 
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literature. Note that in typical time series analysis the influence of the 
initial condition, y/, on the model output diminishes with increasing 
distance from the start of the simulation. In those situations, it is common 
to use a spin-up period to reduce sensitivity to state- value initialization. 

Boyle et al. 23 classified the process of model calibration into three 
levels of increasing sophistication. In level zero, approximate ranges for 
the parameter estimates are specified by physical reasoning that 
incorporates available data about static watershed characteristics (e.g. 
geology, soil, land cover, slope, etc.), via lookup tables or by borrowing 
values from similar watersheds. At this level, only crude a priori ranges 
of the parameters are estimated without conditioning on the input-output 
streamflow response data. In level one, the parameter ranges are refined 
by identifying and analyzing the characteristics of specific segments of 
the response data, that are thought to be controlled by a single or a group 
of parameter(s). In this level, the effects of parameter interactions are 
ignored. Finally, in level two, a detailed analysis of parameter 
interactions and performance trade-offs is performed using a carefully 
chosen representative period of historical data (i.e. calibration data) to 
further refine the parameter ranges or select a representative parameter 
set. This level involves a complex, iterative process of parameter 
adjustments to bring the simulated response as closely and consistently 
as possible to the observed watershed response. 

Calibration can be further divided into two types, depending on 
whether this iterative process is being guided manually by an expert 
hydrologist or automatically by a computer following pre-defined 
algorithmic rules. These approaches are called 3 mDQXDOOFDOLEUfi&VLRQ' 
3 automated calLEUDWLR%$M&\ve\y . 

2.1.1. Overview of the Manual Calibration Approach 

Manual calibration is a guided trial-and-error process performed by an 
expert hydrologist, involving complex knowledge-based analyses to 
match the perceived hydrologic processes in the watershed with then- 
conceptual equivalents represented in the model structure. This 
interactive process can involve a variety of graphical interfaces and a 
multitude of performance measures to transform the historical data into 
information that will aid the hydrologist in decision-making. Although 
progressive steps during manual calibration of a model are generally 
established via pre-defined guidelines, the actual sequence of procedures 
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will vary based on the experience and training of the modeler, the 
properties of the data and characteristics of the watershed system being 
modeled. 23 Successful manual calibration, therefore, requires a good 
knowledge of the physical and response characteristics of the watershed, 
as well as a good understanding of the structure and functioning of the 
various model components and parameters. The major aim is to find 
values for the parameters that are consistent with the hydrologic 
processes they were designed to represent. 24 The consistency between 
model simulations of hydrologic behaviors for which observations are 
available is examined at various timescales and time-periods to try and 
isolate individual effects of each parameter. Hydrologic behaviors 
include, for example, annual averages to understand the long-term water 
balance dynamics, seasonal and monthly averages to identify the trends 
and low-high flow periods, extended recession periods to understand 
watershed baseflow characteristics and event based measures to analyze 
the shape and timing of floods. 2,25 Manual calibration is expected to 
produce process-based (i.e. conceptually realistic) and reliable 
predictions. The National Weather Service (NWS) of the United States, 
for example, primarily uses a manual approach (assisted with automatic 
techniques) to estimate the parameters of their lumped hydrologic 
models used in operational streamflow forecasting. The manual 
calibration, however, is a time- and labor-intensive process involving 
many subjective decisions. Due to this subjectivity, different modelers 
will most likely produce different model parameter values for the same 
watershed. Another difficulty in manual calibration is the ever-increasing 
complexity of watershed models. As more parameters are added, it 
becomes more difficult to estimate them accurately, parameter 
interaction and compensating effects on the modeled variables make it 
difficult to estimate individual parameters. Note that Bayesian methods 
(see Sec. 6) provide a general framework for explicit use of expert 
knowledge/belief in the form of priors. An excellent and comprehensive 

*\A 

discussion of manual calibration is given by three recent reviews. ’ ’ 

2.1.2. Overview of Automated Calibration Approaches 

Automated parameter identification (calibration) methods rely on an a 
priori model structure, optimization algorithm and one or more 
mathematical measures of model performance (often called objective 
function, criterion or measure) to estimate the model parameters using a 
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historical record of observed response data. The advantages of the 
automated calibration approach are not difficult to enumerate. Such 
methods use objective, rather than visual and subjective, measures for 
performance evaluation, and exploit the power and speed of computers to 
efficiently and systematically search the feasible model parameter space. 

Within this context, the goal has been to develop an objective strategy 
for parameter estimation that provides consistent and reproducible results 
independent of the user. 23 A potential disadvantage is that automatic 
calibration algorithms can, if not properly designed, return values of the 
model parameters that are deemed to be hydrologically unrealistic. It is, 

therefore, WKHI PRGHOHUUVD UHV SRQVLELOLWXD WRI H[DPLQHD WKHD UREXVWQH\ 
optimized parameters. A traditional approach, accepted as a minimum 

UHTXLUHPHQW □ IRU □ 3 HYDOXD WLRQ ' □ RI □ SDM^H^dftJthtJREXVWQHWI 1 □ LVD 

modeled and observed response for an independent time period that is 

not used for calibration. This can be done using a split-sample approach 

and should cover a long enough time period so as to contain various 

ranges of hydrologic conditions (e.g. dry and wet periods), and, if 

possible, using multiple response data. 26,27 

In the following sections, we will discuss single- and multi-objective 
model calibration strategies. Single objective methods utilize a single 
measure of closeness between the simulated and observed variables, 
whereas multi-objective approaches attempt to emulate the strengths of 
the manual calibration approach by simultaneously using multiple 
different measures of performance. 


3. Single Criterion Automatic Calibration Methods 

Traditional automated approaches to model calibration utilize a single 
mathematical criterion (sometimes defined as objective function or 

measure! I □ WRD VHDUFKI] IRUD DD XQLTXHI ] 3 EHVW'D SDUDPHWHUD VHWD □ $QD I 

function (Eq. 2) can be defined as an aggregate statistical summary 

TXDQWLIVLQ J □ WKHD 3 FORMRPJTW61BIRW'n EHWZHHQl WKHD VLPXODWHGD 

and observed hydrologic variable(s). The objective function largely 

influences the outcome of the automated calibration procedure and, 

hence, should be carefully selected based on the goal of the modeling. 

Traditionally, automated calibration problems seek to minimize the 
discrepancy between the model predictions and observations by 
minimizing the following additive simple least square (SLS) objective 
function (or its variations) with respect to G: 
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^SL^P) ~ ^ e i (P) ( 3 ) 

i=l 

Often □ □ WKHIH VTXDUHD URRWD IRUPD □ VXFKD DVD 3 5RRWI I OHDQl I 6TXDUHGD 

(RMSE) criterion, is used because it has the same units as that of the 

variable being estimated (e.g. streamflow discharge). An RMSE 

objective function puts strong emphasis on simulating the high flow 

events; to increase the emphasis on the low flows, it can be used after 

performing a log-transformation of the output variables. 

The popularity of the SLS function stems from its statistical 
properties; the SLS is an unbiased estimator under strong assumptions 
related to the distribution of the residuals — i.e. the residuals are pair- 
wise independent, have constant variance (homogenous) and normally 
distributed with a mean value of zero (see also Sorooshian and Gupta, 28 
and Gupta et al. 29 ). However, the validity of these assumptions is 
questionable within a hydrologic context, as most hydrograph 
simulations published to date in the literature show significant non- 
stationarity of error residuals. 30,31 


3.1. A Historical Perspective 

A powerful optimization algorithm is a major requirement to ensure 

finding parameter sets that best fit the data. Optimization algorithms 

iteratively explore the response surface (a surface mapped out by the 

objective function values in the parameter space) WRI TLQGI I WKHD 3 EHVW'D RUD 

3 RSWLPXP'nSDUDPHWHUn\Hf\^dpaS3aHaiffHgffiHfflis developed 

in the past to solve the nonlinear SLS optimization problem stated in Eq. 

(3) may be classified as local search methodologies if they are designed 
to seek a systematic improvement of the objective function using an 
iterative search starting from a single arbitrary initial point in the 
parameter space, or as stochastic global search methods if multiple 
concurrent searches from different starting points are conducted within 
the parameter space. Local optimization approaches can be classified into 
two categories, based on the type of search procedure employed, namely 
derivative-based (gradient) and derivative-free (direct) methods. 

Gradient-based methods make use of the estimates of the local 
downhill direction based on the first and/or second derivative of the 
response surface with respect to each individual model parameter. 32 The 
simplest gradient-based algorithm is that of steepest descent, which 
searches along the first-derivative direction for iterative improvement of 
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the objective function. Newton-type methods, such as the Gauss-Newton 
family of algorithms, are examples of second derivative algorithms. 
Gupta and Sorooshian 33 and Hendrickson et al . 34 demonstrate how 
analytical or numerical derivatives can be computed for conceptual 
watershed models. 

Derivative-based algorithms will evolve towards the global minimum 
in the parameter space in situations where the objective function exhibits 
a topographical convex shape in the entire parameter domain. 
Unfortunately, numerous contributions to the hydrologic literature have 
demonstrated that the response surface seldom satisfies these restrictive 
conditions, and exhibits multiple optima in the parameter space. Local 
gradient-based search algorithms are not designed to handle these 
peculiarities, and therefore often prematurely terminate their search at a 
final solution that is dependent on the starting point in the parameter 
space. Another related problem is that many of the hydraulic parameters 
typically demonstrate significant interactions, because of an inability of 
tiie observed experimental data to properly constrain all of the calibration 
parameters. 

Direct search methods sample the value of the objective function in a 
systematic manner without computing derivatives of the response surface 
with respect to each parameter. Popular examples of direct search 
methods include the Simplex Method , 35 the Pattern Search Method 36 and 
the Rotating Directions Method of Rosenbrock 37 . Many studies have 
focused on comparative performance analysis of local search methods 
for calibration of watershed models . 38,39,40 Their general conclusion was 
that local search methods were not powerful enough to reliably find the 
best (global optimum) values of the watershed model parameters. The 
main limitation of local search methods is that, like gradient-based 
algorithms, their outcome is highly dependent on their initial starting 
point. For example, local search algorithms are prone to getting trapped 
in local basins of attraction (local minima) and, as argued above, may 
become confused in finding the preferred direction of improvement in 
the presence of threshold structures and other undesirable irregularities 
of the response surface. 

Among others, Moore and Clarke 41 and Sorooshian and Gupta 42 
pointed out that the causes of the above difficulties were mainly 
difficulties concerned with the underlying model structure and that local 
search methods were not powerful to do the job. Gupta and Sorooshian 43 , 
for example, focused on model structural inadequacies in the SAC-SMA 
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(Sacramento Soil Moisture Accounting) model and showed that 
parameter identifiability can be improved by a careful re- 
parameterization of the percolation function. Other seminal 
contributions 30,42 concluded that a properly chosen objective function, 
which can better recognize the stochastic nature of the errors in the 
calibration data (such as those derived using Maximum Likelihood 
Theory), can result in smoother response surfaces for which the global 
optimum is easier to identify. In later studies, 30,42 the authors pointed out 
that streamflow measurements contain errors that are temporally auto- 
correlated and heteroscedastic (having non-constant, magnitude 
dependent variance) and also introduced a Heteroscedastic Maximum 
Likelihood Estimator (HMLE) that accounts for non-stationary error 
variance arising from various sources, including the rating curve used to 
convert the stage measurements (in units of height) into runoff rate (in 
units of discharge, e.g. cubic feet per day). In a parallel work, Kuczera 44 
proposed a methodology based on Bayesian statistics to properly account 
for the measurement error properties of the data while predicting the 
confidence bounds for the parameter estimates. Recently, Kavetski et 
al. 45,46 and Vrugt et al. 47,48 have extended that approach to account for 
error in rainfall depths as well. Another commonly used methodology 
applies a parameterized power transformation to the streamflow data to 
stabilize the heteroscedastic measurement error: 49 


,_ (y + l)M 

A 


( 4 ) 


where y and y represent flows in the original and transformed spaces 
respectively and X is the transformation parameter (a commonly used 
value is X ~ 0.0 to 0.3). 

Other researchers have investigated the requirements for calibration 
data and pointed out that the quantity and quality of calibration data play 
a critical role in controlling the success of the calibration procedure. 
These studies concluded that the informativeness of the data is far more 
important than the length and amount used for calibration. 50,51,52,53,54 

The convergence problems encountered with local search algorithms 
have inspired researchers to develop and test global search algorithms for 
calibration of watershed models. While local optimization methods rely 
on a single initial point within the feasible parameter space to start the 
search, global optimization methods utilize multiple concurrent searches 
from different starting points to reduce the chance of getting stuck in a 
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single basin of attraction. Examples of global optimization algorithms 
applied to watershed model calibration include the Random Search (RS) 
method, 55 Adaptive Random Search (ARS), 56 ARS coupled with direct 
local search methods, 57 Controlled Random Search, 58,59 Simulated 
Annealing, 60,61 the multi-start simplex, 62 and genetic algorithm. 63,62,64 For 
a detailed overview of global search algorithms, please see Duan 65 . With 
the advent of computational power, Duan et al. 62 conducted a seminal 
study; focusing first on a detailed analysis of the properties of the 
response surface, they identified five major characteristics that 
complicate the optimization problem in watershed models: 

□ □,WnFRQWDLQVDPRUHnWKDQnRQHnPDLQnUHJLRQnRinDWWUDFWLRQD 

□ □ ,W □ KD V □ PDQ\D ORFDO □ RSWLPD □ ZLWKLQ QHDFKD UHJLRQ □ RI □ DWWUDFWLRQ 

□ □ ,WDLV®ll®^§Kffltinuous derivatives. 

□ □ ,W □ LV □ IOD W □ QHDUI IWKHI RSWLPXP I I ZLWKtfl p»Mid:t® QGD YDUXLQJD GHJUHH 
sensitivities. 

□ □jWVDVKDSHSiikMaQRQcludes long and curved ridges. 

In an effort to design an optimization strategy capable of dealing with 
these difficulties in single-objective calibration problems, Duan et al. 62 
introduced a novel procedure called the Shuffled Complex Evolution - 
University of Arizona (SCE-UA). 


3.2. The Shuffled Complex Evolution - University of Arizona ( SCE- 
UA ) Algorithm 

The Shuffled Complex Evolution Algorithm developed at the University 
of Arizona (SCE-UA) 62,66,67 is a global search strategy that synthesizes 
the features of the simplex procedure, controlled random search 58 and 
competitive evolution 68 with the newly introduced concept of complex 
shuffling. The SCE-UA algorithm has since been employed in a number 
of studies and proved to be consistent, effective, and efficient in locating 
the global optimum to the parameter estimation problems for watershed 
models. 67,69,70,71,72 . In brief, SCE-UA algorithmic steps can be listed as 
follows: 62,66,73 


(1) Generate initial population: sample s points randomly in the 
feasible {a priori) parameter space (using uniform distribution, 
unless prior information exists) and compute the objective 
function value at each point. 

(2) Ranking: sort the s points in order of increasing objective 
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function value so that the first point represents the smallest 
criterion value and the last point represents the largest criterion 
value (assuming that the goal is to minimize the criterion value). 

(3) Partitioning into complexes: partition the s points into p 

complexes, each containing m points. The complexes are 
partitioned such that the first complex contains every p(k-l)+l 
ranked point, the second complex contains every p(k-l)+2 
ranked point, and so on, where k = 1,2 m. 

(4) Complex evolution: evolve each complex according to the 
competitive complex evolution algorithm, which is based on the 
Simplex downhill search scheme. 35 The evolution procedure 

JHQHUD WHV I QHZD SRLQWVI FDOOHGD 3 RHVSULQJ'D WKDWIH □ RQl I DYHU1 
within the improvement region. 

(5) Complex shuffling: combine the points in the evolved complexes 
into a single sample population; sort the sample population in 
order of increasing criterion value; shuffle (i.e. re-partition) the 
sample population into p complexes according to the procedure 
specified in Step (3). 

(6) Check for convergence: if any of the pre-specified termination 
criteria are satisfied, stop; otherwise, continue. Termination 
criteria can be specified as maximum number of iterations (or 
maximum number of shuffling) or parameter convergence. 

Experience with the method has indicated that the effectiveness and 
efficiency of the SCE-UA algorithm is influenced by the choice of a few 
algorithmic parameters. Duan et al. 66 ’ 73 performed sensitivity studies and 
suggested practical guidelines for selecting these algorithmic parameters 
according to the degree of difficulty of the optimization problem. The 
primary parameter to be selected is the number of complexes, p. The 
above studies showed that the dimension of the calibration problem (i.e. 
number of parameters to be optimized), n, is the primary factor 
determining the proper choice of p\ practically (if no other information is 
available) p is set to the greater value between 2 or n (see also 
Kuczera 72 ). The size of a complex, m, is generally chosen to be equal to 
2n + 1. Accordingly, the sample (population) size, s, becomes the 
product p Um. The number of offspring, /?, that can be generated by each 
independently evolving complex between two consecutive shuffles is the 
same as the complex size (2n + 1), the size of each sub-complex selected 
for generation of an offspring (via Simplex scheme) is n + 1 and defines 
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a first order approximation to the objective function space. The number 
of consecutive offspring generated by each sub-complex, a, is equal to 1. 
In selecting these algorithmic parameters, a balance between algorithm 
effectiveness and efficiency should be sought. For instance, selecting a 
large number of complexes increases the probability of converging to the 
global optimum, however at the expense of a larger number of 
simulations (and hence longer computational time). The SCE-UA code is 
available free of charge from the following web address: 
www. sahra.arizona. edu/soft ware. 

While significant progress has been made in the use of global 
optimization algorithms for parameter estimation, the current generation 
of optimization algorithms typically implements a single operator (i.e. 
Simplex search in the case of SCE-UA) for population evolution. 
Reliance on a single model of natural selection and adaptation presumes 
that a single method can efficiently evolve a population of potential 
solutions through the parameter space and work well for a large range of 
problems. However, existing theory and numerical benchmark 
experiments have demonstrated that it is impossible to develop a single 
universal algorithm for population evolution that is always efficient for a 
diverse set of optimization problems. 74 This is because, the nature of the 
response surface often varies considerably between different 
optimization problems, and often dynamically changes en route to the 
global optimal solution. It therefore seems productive to develop a search 
strategy that adaptively updates the way it generates offspring based on 
the local peculiarities of the response surface. 

In light of these considerations, Vrugt and Robinson 75 and Vrugt et 
al. 76 have recently introduced a new concept of self-adaptive multi- 
method evolutionary search. This approach, termed as A Multi 
Algorithm Genetically Adaptive Method (AMALGAM), runs a diverse 
set of optimization algorithms simultaneously for population evolution 
and adaptively favors individual algorithms that exhibit the highest 
reproductive success during the search. By adaptively changing 
preference to individual algorithms during the course of the optimization, 
AMALGAM has the ability to quickly adapt to the specific peculiarities 
and difficulties of the optimization problem at hand. A brief algorithmic 
description of AMALGAM for solution of multi -objective optimization 
problems is given in Section 4. Synthetic benchmark studies covering a 
diverse set of problem features, including multimodality, ruggedness, ill- 
conditioning, non-separability, interdependence (rotation) and high- 
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dimensionality, have demonstrated that AMALGAM significantly 
improves the efficiency of evolutionary search. 75,76 An additional 
advantage of self-adaptive search is that the need for algorithmic 
parameter tuning is reduced, thus increasing applicability to solving 
search and optimization problems in many different fields of study. An 
extensive algorithmic description of AMALGAM, including comparison 
against other state-of-the-art optimization methods, can be found in 
Vrugt and Robinson 75 and Vrugt et al. 76 . The AMALGAM code is 
available from the second author upon request. 


3.3. Limitations of Single Criterion Methods 

Despite these algorithmic advances, automated model evaluation 
strategies that rely on a single regression-based aggregate measure of 
performance (e.g. RMSE) are, in general, weak and make it 
unnecessarily difficult to isolate the effects of different parameters on the 
model output. 10,77,78 Hence, two different parameter combinations might 
give completely different streamflow responses, but result in very similar 
values of the objective function. This is undesirable. A major reason for 
this is the loss (or masking) of valuable information inherent in the 
process of projecting from the high dimension of the data set ($R Data ) 
down to the single dimension of the residual-based summary statistic 
(9? '), leading to an ill-posed parameter estimation problem ($R Parameter < 
9? Data ).77 To avoid (or at least minimize) this problem, an optimization 
strategy must necessarily make use of multiple, carefully selected, 
measures of model performance, thereby more closely matching the 
number of unknowns (the parameters) with the number of pieces of 
information (the measures), resulting in a better-posed identification 
problem. There is, therefore, an urgent need to develop mathematical 
theory that more convincingly proves this line of thought and provides 
ways forward to improve parameter inference. This is especially pressing 
within the context of spatially distributed models that contain a manifold 
of parameters for which little compelling a priori information is 
available about appropriate values. 


4. Multi-Criteria Automatic Calibration Methods 

Multi-criteria analysis can be used to assimilate information from 
multiple non-commensurable (i.e. not measurable by the same standard) 
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sources. 10 The goal is to increase the extraction of information content 
from the data (decrease the gap between measure and parameter 
dimensions), by properly expressing the different important aspects of 
model performance. For instance, a number of criteria can be formulated, 
each of which is sensitized to a specific watershed output flux (e.g. 
water, energy, chemical constituents) for which measurements are 
available. 79,27 In principle, each criterion can also be designed to isolate a 
different characteristic behavior of some component of the physical 
system. 80 Note that the process of interactive manual-expert evaluation 
and calibration of a model, following a process-guided set of rules, 
actually follows a powerful (albeit somewhat subjective) multi -criteria 
approach, wherein a variety of graphical and numerical tools are used to 
highlight different aspects of model response. ’ ’ A major advantage of 
the automated multi-criteria approach is that various aspects of the 
manual calibration strategy can be absorbed into the calibration process, 
thus strengthening the physical basis of identified parameters. Many 
automated (or semi-automated) multi-criteria calibration strategies have 
been proposed for calibration of watershed models. These strategies can 
be broadly classified into simultaneous, step-wise and constraining 
approaches. These three approaches are discussed next. 


4.1. Simultaneous Multi-criteria Calibration Approach 

The simultaneous multi-criteria approach finds a set of solutions (so- 

FDOOHGD 3 3DUHWRD RSWLPDO' □ UHJLRQD □ WKDWI I VLPXOWDQHRXVO\D RSWLPLJHE 

optimization run) and trade-off the performance of several user-selected 

criteria that measure different aspects of model performance. 10,81,82,11 In 

general, the multi-criteria model constraining problem can be expressed 

in the following form: 10 

min F mc (6) = {F,(0),F 2 (0),..F n (6)} © (5) 

w.r.t.O 

where F x {ff),F 2 {0),...f' n {O) represent the different performance criteria 
summarizing information related to various components of the physical 
system. To solve this multi-criteria problem, model parameter sets (0) 
are systematically sampled from their a priori region (<9) in search of 
solutions that simultaneously minimize all of these criteria. As is well 
known, it is generally not possible to satisfy all of the criteria 
simultaneously. The solution to this minimization problem (Eq. 5) is 
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generally not unique but takes the form of a Pareto surface that 
characterizes the trade-offs in its ability to satisfy all of the competing 
criteria. 10,81 

As a commonplace illustration, consider the migration of birds from 
Scandinavia to Africa and backwards that simultaneously considers flight 
time and energy use (Figure 2). For this situation, there is no single 
optimal solution. Rather, there is a family of tradeoff solutions along a 
FXUYHD FDOOHGD WRHWEPDDHFWKRQ W ' □ AfrSiIBffillhiJK$t I 
dimensions) in which improvement in one objective (say, reduction in 
flight time) comes only at the expense of increased energy use per day 
(second objective). In other words, moving from one solution to another 
along the Pareto surface will result in the improvement of at least one 
criterion while deteriorating at least one other. Analysis of the size and 
properties of the Pareto region can provide insights into possible model 
improvements as well as degrees of confidence in different aspects of the 
model predictions. 10,84 

Many computational approaches to deriving efficient estimates of the 
Pareto solution set have been proposed. 81,82,85,86 A pioneering algorithm 
that provides an approximation to the Pareto optimal region in a single 



► I-lighl direction 
DQOi Wintering grounds 

Fig. 2. An illustration of the Pareto optimality, a) Map showing migration paths of birds 
from Scandinavia to Africa and backwards, b) Birds trying to simultaneously optimize 
flight time and energy use. There exists a family of trade-off solutions along a curve 
FDOOHGDWXHRSJBOPIWRinptfpWriligt et al. 83 ) 
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optimization run is called the MOCOM-UA (Multi-Objective COMplex 
evolution algorithm) developed at the University of Arizona. 81 MOCOM- 
UA is a general-purpose multi-objective global optimization strategy that 
is based on an extension of the SCE-UA population evolution method. 62 
It combines the strengths of the controlled random 
search method, 58 Pareto ranking 87 and a multi-objective downhill simplex 
search method. 

In brief, the MOCOM-UA method starts with an initial sampling of a 
population of s points distributed randomly throughout the ^-dimensional 
feasible parameter space, 0. In the absence of prior information about 
the location of the Pareto region, a uniform sampling distribution is used. 
For each sampled point, the multi-criteria vector, F M c(6), is calculated 
and the population is ranked and sorted using the Pareto-ranking 
procedure presented by Goldberg 87 . Simplexes of n + 1 points are then 
selected from the population according to a robust rank-based selection 
method suggested by Whitley 88 . A multi-objective extension of the 
downhill simplex method is used to evolve each simplex in a multi- 
criteria improvement direction and generate new (better) points. Iterative 
application of the ranking and evolution procedures causes the entire 
population to converge towards the Pareto optimum. The procedure 
terminates automatically when the population converges to the Pareto 
Region with all points mutually non-dominated. The final population 
provides a fairly uniform approximation of the Pareto solution space 
P( 0). The details of the MOCOM-UA algorithm can be found in Yapo et 
al. 81 The MOCOM-UA code is available from the following web 
address: www.sahra.arizona.edu/software. 

MOCOM-UA has been successfully applied to a number of multi- 
criteria calibration studies of hydrologic 23 ’ 10 and land-surface models. 89,90 
However, some studies reported that the MOCOM-UA algorithm has a 
tendency to cluster the solutions in the center of Pareto region 91 and may 
require a very large number of model runs for convergence. 90 The 
MOSCEM-UA (Multiobjective Shuffled Complex Evolution Metropolis 
- University of Arizona) method 82 is especially designed to overcome 
some of these convergence problems, but its stochastic nature causes the 
Pareto solution set to contain irregularities with bumpy fronts / surfaces 
as a consequence. 

The AMALGAM evolutionary search algorithm recently developed 
by Vrugt and Robinson 75 , has shown to be the method of choice for 
solving multi-objective optimization problems. This method utilizes self- 
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adaptive multi-method evolutionary search and is more efficient and 
robust than existing search approaches including the commonly used 
Strength Pareto Evolutionary Algorithm (SPEA2) 92 and Non-Dominated 
Sorting Genetic Algorithm (NSGA-II). 93 The AMALGAM algorithm is 
initiated using a random initial population P 0 of size N, generated using 
Latin hypercube sampling. Then, each parent is assigned a rank using the 
fast non-dominated sorting (FNS) algorithm 93 . A population of offspring 
Q 0 , of size N, is subsequently created by using the multi-method search 
concept that lies at the heart of the AMALGAM method. Instead of 
implementing a single operator for reproduction, we simultaneously use 

k individual algorithms to generate the offspring, Q> = {£>£,.. .,£$}• 
These algorithms each create a pre-specified number of offspring points, 
N = {N' t ,...,7Vf } , from P 0 using different adaptive procedures. After 


creation of the offspring, a combined population R 0 = P 0 u Q 0 of size 2 N 
is created and R 0 ranked using FNS. By comparing the current offspring 
with the previous generation, elitism is ensured since all previous non- 
dominated members will always be included in P. 93,94,95 Finally, members 
for the next population P t are chosen from subsequent non-dominated 
fronts of R 0 based on their rank and crowding distance. 93 The new 
population Pi is then used to create offspring using the method described 
below, and the aforementioned algorithmic steps are repeated until 
convergence is achieved. 

To ensure thaWD WKHD 3 EHVW'D DOJRULWKPVD DUHD ZHLJKWHGD VRD WKDWI 
contribute the most offspring to the new population, we update 

{N),...,N*} according to: 


n\ = n 


n‘ 

iy t~\ 


yk P‘ T 1 

L i=l ^ I] 


i = 1 


( 6 ) 


The term (if/ is the ratio of the number of offspring points an 
algorithm contributes to the new population, if , and the corresponding 
number the algorithm created in the previous generation ( ). The rest 

of the expression scales the reproductive success of an individual 
algorithm to the combined success of all the algorithms. 

Benchmark results using a set of well-known multi-objective test 
problems show that AMALGAM is three to ten times more efficient than 
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existing multi-objective optimization algorithms. Initial applications to 
hydrologic parameter estimation problems have reported similar 
efficiency improvements. 96,97 

Notwithstanding this progress made, approximation of the entire 
Pareto surface can be computationally too expensive. This is especially 
true within the context of distributed hydrologic modeling, for which 
approximate Pareto-optimal solutions can be obtained by lumping the 
various individual performance criteria within a single aggregate 
function through scalarization: 

F agg (O) = 'to> i T i F i (0) ( 7 ) 

i=l 

where k denotes the number of criteria, and of^I\ D(£p/IIIDUHI IZHLJKWVI IDQGl I 

scaling transformations, respectively, applied to each criterion. Here, the 

weights define the relative contributions of each criterion to the 

aggregate function, F agg , such that a>, + a> 2 □«□ co k = 1, and can be 

subjectively assigned to place greater or lesser emphasis on some aspect 

of the model. Further, they can also be formalized to reflect our 

knowledge regarding the relative degree of uncertainty in the various 

measurement data sources. 98 It can be shown theoretically that if the 

Pareto region is convex, its shape can be well approximated by 

systematically varying the weights assigned to each of the criteria 

F t (G) over all possible values. Note that the above formulation also 

accounts for the fact that, in general, the various criteria F t (6) may not be 

directly commensurable and may, in fact, vary over very different orders 

of magnitude; the multiplier is used to compensate for this difference 

by transforming the criteria onto a commensurable scale. 85,11,99 


4,2. Step-wise Multi-criteria Calibration Approach 

The step-wise multi-criteria approach aims to reduce the dimension of 
the optimization problem by breaking it into several sub-problems that 
are handled in a step-by-step manner; each step considers different 
aspects of hydrologic response by transforming the flows, 100,101 by 
focusing on different time periods, 102,23,103,104 and/or different time 
scales. 105,9,106 For example, the Multi-step Automatic Calibration 
Scheme 100,101 (MACS) uses the SCE-UA global search algorithm and a 
step-by-step process to emulate some aspects of the progression of steps 
followed by an expert hydrologist during manual calibration. The step- 
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by-step process is as follows: (1) the lower zone parameters are 
calibrated to match a logarithmic transformation of the flows, thereby 
placing a strong emphasis on reproducing the properties of the low-flow 
portions of the hydrograph; (2) the lower zone parameters are 
subsequently fixed and the remaining parameters are optimized with the 
RMSE objective function to provide a stronger emphasis on simulating 
high-flow events; and (3) finally, a refinement of the lower zone 
parameters is performed using the log-transformed flows while keeping 
the upper zone parameters fixed at values derived during step two. The 
method has been tested for a wide variety of hydro-climatic regimes in 
the United States and has been shown to produce model simulations that 
are comparable to traditional manual calibration techniques. 100,101 

Other studies that utilize step-wise procedures in the calibration of 
hydrologic model parameters also exist. Brazil 102 estimated parameters of 
the SAC-SMA model using a combination of an interactive analysis of 
observed streamflow time series, an automated global search algorithm 
and a local search algorithm for fine-tuning. Similarly, Bingeman et al. 27 
utilized a hybrid approach combining manual and automatic approaches 
in a step-wise strategy to calibrate and evaluate the spatially distributed 
WATFLOOD watershed model. Boyle et al. 23 utilized a simultaneous 
multi-criteria approach to generate a set of Pareto optimal solutions, 
showing performance trade-off in fitting different segments of the 
hydrograph, and then selected a parameter set from within the Pareto 
region that best satisfies two long-time-scale statistical measures of fit 
(mean and variance). In a more sophisticated approach, Pokhrel et al. 107 
constructed the Pareto optimal solutions using traditional objective 
functions (i.e. RMSE and log-RMSE) and then employed 3 signature 
measures' of model performance 80 to select a parameter set from within 
the solutions in close proximity to Pareto region. In an effort to improve 
the performance of the United States Geological Survey (USGS) 
Precipitation Runoff Modeling System, Leavesley et al. 9 performed a 
step-wise approach which began by calibrating the parameters affecting 
water balance, followed by the parameters related to hydrograph timing 
and soils and vegetation, respectively. Harlin 108 and Zhang and 
Lindstrom 103 introduced process-oriented step-wise calibration strategies 
for the HBV hydrological model. Their procedures partitioned the flow 
time series into several periods, where specific hydrologic processes 
dominate, and linked these periods to specific parameter(s) during 
calibration. Turcotte et al. 105 developed a step-wise calibration strategy 
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for the HYDROTEL distributed model. In the procedure, they first 
calibrated the model parameters sensitive to objectives related to long 
timescales and then calibrated those parameters sensitive to short 
timescales. Their calibration steps included parameters related to large- 
scale water balances, evapotranspiration, infiltration capacity and 
routing, respectively. Shamir et al. 106 introduced a step-wise parameter 
estimation approach based on a set of streamflow descriptors that 
emphasize the dynamics of the streamflow record at different timescales. 
Fenicia et al. 84 showed how simultaneous and step-wise multi-criteria 
optimization strategies can help in understanding model deficiencies and 
hence guide model development. Wagener et al. 109 (DYNIA) and Choi 
and Beven 104 developed process-based approaches based on Generalized 
Sensitivity Analysis 110 ’ 111 (GSA) to incorporate the time-varying nature 
of the hydrologic responses into model/parameter identification. 
Specifically, the DYNIA approach adopts the GSA procedure within a 
sequential Monte Carlo framework to locate periods of high 
identifiability for individual parameters and to diagnose possible failures 
of model structures. Vrugt et al. 112 presented a similar idea to assess the 
information content of individual observations with respect to the various 
model parameters. Their method, called PIMLI, merges the strength of 
Bayesian Inference with Random Walk Metropolis sampling to resample 
the parameter space each time the most informative measurement is 
added to the objective function. Results showed that only a very small 
number of streamflow measurements was actually needed for reliable 
calibration of a parsimonious five-parameter rainfall-runoff model. 
Specifically, about 95% of the discharge observations turned out to 
contain redundant information. Similar conclusions were drawn for case 
studies involving subsurface flow and transport models. The 
development of PIMLI and DYNIA has been inspired by Sequential 
Monte Carlo (SMC) schemes that provide a generalized treatment of 
time-varying parameters and states, and are increasingly being used for 
posterior tracking within the statistical and computational science 
literature. 


4.3. Multi-criteria Constraining Approach 
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The basic concept of the constraining approach differs from automated 
calibration strategies, described above, in the sense that it considers 
3 consistency' of model structure/parameters as the ultimate goal, while 
the latter aims at optimality. In other words, the constraining approach 
seeks parameter estimates that are consistent with some minimal 
thresholds of performance on several criteria. Rooted in the concept of 
the Generalized Sensitivity Analysis (GSA), 110,111 models/parameters are 
separated into behavioral/non-behavioral groups by comparing then- 
performance with a subjectively selected threshold behavior. The 
models/parameters that present better performance than the selected 

WKUHVKROG □ DUHD DFFHSWHGD DVD 3 EHKD YLRUDO ' □ DQGD FRQVLGHUHG □ D 

acceptable representation of the system. The remaining 

models/parameters are rejected as non-behavioral. The Monte Carlo 

simulation-based GSA approach is at the core of many model 

identification and uncertainty estimation techniques (e.g. GLUE 

methodology of Beven and Binley 113 and DYNIA methodology of 

Wagener et al. 109 ). 

The main advantages of the constraining approach are its ease of 
implementation and use and its flexibility in incorporating additional 
performance criteria into the parameter estimation problem. Similar to 
the step-wise multi-criteria calibration approach (Sec. 4.2) several 
performance measures can be formulated and behavioral 

models/parameter sets can be selected considering a single or a group of 
performance measure(s). 80 The selected set (behavioral set) can be 
further analyzed using the concept of Pareto optimality to identify a 
VLQJOHD 3 EHVW □ SDUDPHWHU □ VHW ' □ 

We illustrate the constraining approach with an overview of the step- 
wise semi-automated methodology, recently proposed by Yilmaz et al. 80 
In Yilmaz et al. 80 , parameters of the spatially distributed SAC-SMA 
model were constrained based on a set of hydrologically meaningful 
3 VLJQD WXUH □ PHDTKhSIVigiiSture measures target and extract 
hydrologically relevant (contextual) information from the outlet 
streamflow observations in ways that correspond to the following 
behavioral functions of a watershed system: (1) Maintain overall water 
balance, (2) Vertically redistribute excess rainfall between fast and slow 
runoff components, and (3) Redistribute the runoff in time (influencing 
hydrograph timing and shape). The selected signature measures were 
defined in terms of percent bias between the following hydrologically 
meaningful indices (calculated using simulated and observed outlet 
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streamflow): the percent bias in runoff ratio (%BiasRR), various 
properties of the flow duration curve (high-flow volume, %BiasFHV; 
slope of the mid-segment, %BiasFMS; low-flow volume, %BiasFLV; 
and median flow, %BiasFMM) and a simple index of watershed lag-time 
(%BiasTLag). One advantage of using signature-based measures in a 
constraining approach is that they can take on both positive and negative 
values, thereby indicating the direction of improvement. The procedure 
starts with establishing relationships between signature measures and 
parameters of the SAC-SMA model (via a procedure rooted in random 
sampling). These relationships are then used to constrain the ranges of 
parameters towards regions of signature measure improvement. In the 
second step, additional random samples are generated from the 
constrained parameter ranges and behavioral parameter sets are selected 
by establishing thresholds on the signature measures. There are various 
ways in which these thresholds can be defined. In Yilmaz et al. 80 the 
performance of the SAC-SMA model with a priori parameter values was 
used as benchmark. The procedure for selection of the behavioral 
parameter sets is depicted in Figure 3; the names of the signature 
measures are listed on the x-axis, and the y-axis shows their 
corresponding values. Each line along the x-axis represents a parameter 
set. The gray-dashed line represents the performance of the model using 
a priori parameters and the shaded region envelops the signature 
measure improvement region (i.e. behavioral region), defined as ±1 times 
the a priori model performance. The solid line-triangle parameter 
combination falls entirely within the gray region, and therefore 
represents a behavioral parameter set. 


5. Automated Calibration of Spatially Distributed Watershed 

Models 

3r LVWULEXWHG' □ ZD WHU^W&f 1 WOHCfe\the potential ability to 
simulate the spatial distribution of hydrologic processes in the watershed 
of interest, as well as to provide estimates of discharge volume along the 
entire length of the channel network so that, for instance, the dynamic 
evolution of flood inundation regions can be estimated. 

Although their potential benefits are clear, the spatial complexity of 
these processes is perceived to be an obstacle to the proper identification 
of distributed model components and parameters, translating into 
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Signature Measure 


Fig. 3. Illustration of the parameter constraining approach proposed by Yilmaz et al. 80 
Thresholds are selected based on the signature measure values for the model using a 
priori parameter sets (gray-dashed line). Behavioral region is defined as ±1 times the a 
priori model performance (shaded region). Solid-line-triangle and solid-line-circle 
represent a behavioral and a non-behavioral parameter set, respectively. (After Yilmaz et 
al. 80 ) 


significant predictive uncertainty in the model results. 118 Therefore, 
controversy still persists regarding how such models should be 
implemented. 119 ’ 120,121 For example, Phase One of the recent Distributed 
Model Intercomparison Project 122 (DMIP-1) FRQFOXGHV □ WKDW □ 3 UHOLDEOH I IDQGD 
objective procedures for parameter estimation, data assimilation, and 
HUURU □ FRUUHF WLRQ ' □ VWLOOl IQHHGH WRDEHD GHYHORSHGl I 
An important strength of distributed watershed models is that then- 
parameters (and structure) can, in principle, be inferred from data about 
watershed physical characteristics via a local a priori parameter 
estimation approach (See Sec. 2 for details). However, in general, 
parameters estimated using a local a priori approach continue to require 
some degree of calibration (adjustment of the model parameter fields) to 
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ensure that the input-state-output behavior of such models remains 
consistent with the available data. 10,11,12,123 

In calibration of watershed models having spatially distributed 
parameter fields, the number of parameter values to be estimated can be 
quite large, resulting in an ill-posed optimization problem. Note that the 
watershed outlet streamflow time series, the most widely used hydrologic 
variable in calibration of watershed models, represents the integrated 
response of the catchment, and therefore the effects of sub-catchment 
(local) scale variability in hydraulic properties has been averaged out to 
some extent. Therefore, it will typically only contain information about 
the watershed scale properties of the parameter fields (the mean values 
and the broad scale spatial patterns), rather than the smaller scale patterns 
of variation. As a consequence, the number of unknowns (parameters) 
will be larger than the one that can be identified based on the information 
content of the streamflow time series data, and hence the estimated 
parameters will most likely be unrealistic. 

Part of the solution to the stabilization of ill-posed problems of this 
kind is to recognize that the spatially distributed elements of the model 
parameter fields are not, in fact, independent entities that can take on 
arbitrary values in the parameter space. Instead, their values are 
somehow related to the spatial distributions of various hydrologically 
relevant watershed characteristics, including, for example, geology, soil 
type, vegetation, topography, etc. (e.g. Grayson and Bloschl 124 ). 
Recognition of these dependencies can facilitate implementation of 
regularization relationships that help to constrain the dimensionality of 
the parameter estimation problem (e.g. Pokhrel et al. 125 ; Doherty and 
Skahill 126 ). 

Regularization is a mathematical technique that allows stabilization 
of the otherwise ill-posed (over-parameterized) parameter estimation 
problems through introduction of additional information about the 
parameters. 127,128,129 ’ 130 Two kinds of regularization techniques are in 
common use and these help to: 1) improve conditioning of the 
optimization problem through use of additional information and/or 2) 
reduce the dimensionality of the parameter search space. 126,131,125 An 
example of the former technique is called Tikhonov Regularization 
(TR), 127 which employs a penalty function approach 132 wherein the 
original parameter dimension is retained, while the shape of the global 
objective function to be optimized, F G (6, p), is modified by a 
regularization objective function, F par (6)\ 
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F c (0, M )=F(g)+M- (8) 

where F par (G) represents a penalty applied on the solutions that deviate 
from satisfying the regularization constraints. The regularization (or 
penalty) parameter, p, is the weight assigned to the regularization 
constraints; greater values assigned to p will result in improved 
parameter reasonableness and stability, although possibly at the cost of a 
poorer fit between model outputs and measurements (e.g. Yilmaz 133 ). 
Sometimes, an appropriate value for p can be assigned through 
knowledge of the underlying uncertainty associated with the various 
components of F par (6). Such information, however, is often unavailable 
and thus alternative methods have been developed (e.g. Doherty 130 ; 
Mertens et al. 98 ; Yilmaz 133 ). An example of a regularization technique 
with the potential to significantly reduce the dimensionality of the 
estimable parameter space is the Truncated Singular Value 
Decomposition (TSVD). This technique only focuses on dominant 
directions and excludes search directions that are associated with 
negligible eigenvalues with little or no function sensitivity. 134,135,136 
Either separately or together, Tikhonov regularization and TVSD have 
been used to address high-dimensional inverse modeling problems in 
hydrology. 137,138,139,126,140 Vrugt et al. 141 have recently completed the 
development of a global optimization algorithm with self-adaptive 
subspace learning that uses TVSD to continuously update the dominating 
eigenvectors estimated from an evolving population of solutions. Initial 
results demonstrate significant efficiency improvements in solving high- 
dimensional parameter estimation problems (more than 200 parameters). 
A beta version of this algorithm has been developed in MATLAB and 
can be obtained from the second author upon request. 

A simple and commonly used regularization technique to reduce the 
dimensionality of the spatially distributed parameters of watershed 
models seeks to characterize and preserve the pattern of relative spatial 
variation provided by the (local) a priori parameter fields (or maps). In 
its simplest form, one scalar multiplier per a priori parameter field is 
used to vary only the mean level of the parameter field, while 
constraining the values to remain within their pre-defined physical 
ranges (see Bandaragoda et al. 142 and White et al. 143 , among others). 
More sophisticated approaches utilize both additive and multiplicative 
terms, 144,9 non-linear transformations via a single parameter, 80 or more 
complex approaches 125 that are combinations of the above. These 
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techniques simplify the calibration problem to that of finding the 
parameters of a transformation function (e.g. multipliers) that modifies 
the parameter fields in such a way that the model performance is 
improved, while preserving, to some extent, the spatial patterns of the 
parameters. As an example, Figure 4a shows a grid overlay of the 
distributed version of the SAC-SMA model for Blue River Basin, near 
Blue, Oklahoma. The grid shown is the a priori field for the upper zone 
free water maximum (UZFWM) parameter (size of the upper layer free 
water tank) estimated via Koren et al. 7 approach. The histogram in Figure 
4b displays the distribution of a priori values of parameter UZFWM 
within its feasible range (x-axis limits). Figure 4c shows the UZFWM 
distributions upon transformation via scalar multiplier. It can be seen that 
multiplication has a large impact on the variance of the distribution of 
UZFWM in the form of expansion (multiplier >1) and compression 
(multiplier < 1). This approach creates problems when any of the 
individual grid values in each parameter field exceeds its specified 
(physically reasonable) bounds; either the parameter distribution must be 
truncated so that any values exceeding the range are fixed at the 
boundaries or the mean level must be prevented from varying over the 
entire range. An alternative, and more flexible, approach that removes 
these restrictions has been proposed by Yilmaz et al. 80 , which utilizes a 
non-linear transformation with one-parameter (P -parameter) to vary the 
entire parameter field. Notice in Figure 4d that as the P -parameter is 
varied towards either of its limiting values, (0,2], the variance of the 
parameter distribution is compressed so as to keep the entire distribution 
within the feasible range, while preserving the monotonic relative 
ordering of parameter values in the field. 

Two recent approaches which have not been discussed so far, but 
show great promise to solving the hydrologic model calibration problem 
are the Dynamically Dimensioned Search Algorithm 145 and the 
Constrained Optimization using Response Surfaces method 146 ’ 147 . These 
two algorithms work well with a limited budget of function evaluations, 
and are admirably suited for computationally demanding distributed 
watershed models, which require between minutes and hours to run on a 
single processor. 

The Dynamically Dimensioned Search (DDS) Algorithm 145 is a 
stochastic, single-objective search algorithm designed to rapidly find 
3 JRRGD TXDOLWV □ JOREDO □ WOXWLRQJf5l6f (t&MpjDQQdr □ "6 □ 
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Fig. 4. The effect of using transformation functions on the distribution of UZFWM 
parameter values, (a) Map showing the a priori field of the UZFWM parameter (size of 
the upper layer free water tank) within the distributed version of the SAC-SMA model 
setup for the Blue River Basin, near Blue, Oklahoma. Histograms showing the 
distribution of parameter values within UZFWM parameter field: (b) a priori specified 
values, (c) parameter values transformed using scalar multipliers, (d) parameter values 
DTWHU IXVMpbrmation function proposed by Yilmaz et al. 80 

space globally, but when the search progresses the method settles down 
with emphasis on local parameter refinements. The transition from global 
to local search is achieved by dynamically and probabilistically reducing 
the number of search dimensions (i.e. the number of model parameter 
values being perturbed) in the neighborhood of the current best estimate. 
The DDS algorithm has many elements in common with evolutionary 
strategies, and has only one algorithmic parameter. This is the 
neighborhood perturbation size, which is calculated from a multi-normal 
distribution with standard deviation equal to some fraction of the initial 
range of each decision variable. The user specified inputs are the initial 
solution and the maximum number of model evaluations. The DDS 
algorithm is very simple to implement, and has been used in calibration 
of a rainfall runoff model 145 , and a land surface scheme . 148 The DDS 
algorithm can also be used to assess parameter uncertainty by using 
multiple trails with different starting points and analyze their joint 
sampling paths . 149 Such an approach violates first-order Markovian 
properties and can therefore only provide a rough estimate of parameter 
uncertainty (see next Section). Recent work has compared the 
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performance of DDS with SCE-UA for a range of studies . 145,150,151 
Behrangi et al . 150 has introduced an alternative SCE-UA algorithm that is 
at least as efficient as DDS when a limited budget of function evaluations 
is considered. 

Constrained Optimization using Response Surfaces 146,147 (CORS) is a 
response surface approximation method. A response surface model is a 
multivariate approximation of the black box objective function and used 
as surrogate for optimization in situations where function evaluations are 
computationally expensive. In the CORS method the next point 
(parameter combination) is chosen to be the one that minimizes the 
current response surface model subject to various constraints including 
that the new point must be of some minimum distance from previously 
evaluated points. This distance is sequentially reduced from a high value 
(global search) at the start of the search to a low value (local search) at 
termination. This method has shown to converge nicely to the global 
optimum for a set of continuous functions. 


6. Treatment of Parameter Uncertainty 

One major weakness of the automated calibration methods discussed 
earlier in this chapter is their underlying treatment of the uncertainty as 
being primarily attributable to the model parameters, without explicit 
treatment of input, output and model structural uncertainties. It is well 
known, however, that the uncertainties in the watershed modeling 
procedure stem not only from uncertainties in the parameter estimates, 
but also from measurement errors associated with the system inputs and 
outputs, and from model structural errors arising from the aggregation of 
spatially distributed real-world processes into a relatively simple 
mathematical model. Not properly accounting for these errors results in 
residuals that exhibit considerable variation in bias (non-stationarity), 
variance (heteroscedasticity) and correlation structures under different 
hydrologic conditions and, hence, undermines our efforts to derive 
meaningful parameter estimates that properly mimic the target 
hydrologic processes. 

One response to the treatment of uncertainty in the parameter 
HVWLPD WLRQ □ SUREOHP □ LVD WRD DEDQGRQ □ ' WKHD VHDUFKD IRU ODD VLQ JOH □ 3 EE 
combination and adopt a Bayesian viewpoint, which allows the 
identification of a distribution specifying the uncertainty regarding the 
values of the model parameters. Bayesian statistics have recently found 
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increasing use in the field of hydrology for statistical inference of 
parameters, state variables and model output prediction . 152 ’ 153 ’ 154 ’ 155,156,157 
For a recent review, see Liu and Gupta 157 . The Bayesian paradigm 
provides a simple way to combine multiple probability distributions 
using Bayes theorem. In a hydrologic context, this method is suited to 
systematically address and quantify the various error sources within a 
single cohesive, integrated and hierarchical method. 

To successfully implement the Bayesian paradigm, sampling methods 
that can efficiently summarize the posterior probability density function 
(pdf) are needed. This distribution combines the data likelihood with a 
prior distribution using Bayes theorem, and contains all the desired 
information to make statistically sound inferences about the uncertainty 
of the individual components in the model. Unfortunately, for most 
practical hydrologic problems, this posterior distribution cannot be 
obtained by analytical means or by analytical approximation. For this 
reason, researchers commonly resort to iterative approximation methods, 
such as Markov Chain Monte Carlo (MCMC) sampling, to generate a 
sample from the posterior pdf. 


6.1. Random Walk Metropolis (RWM) algorithm 

The basis of the MCMC method is a Markov chain that generates a 
random walk through the search space with stable frequencies stemming 
from a fixed probability distribution. To visit configurations with a stable 
frequency, an MCMC algorithm generates trial moves from the current 
( 3 old' ) position of the Markov chain 
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hydrology is the Shuffled Complex Evolution Metropolis (SCEM-UA) 
global optimization algorithm. 155 This method is a modified version of 
the original SCE-UA global optimization algorithm, 62 and runs multiple 
chains in parallel to provide a robust exploration of the search space. 
These chains communicate with each other through an external 
population of points, which are used to continuously update the size and 
shape of the proposal distribution in each chain. The MCMC evolution is 
repeated until the R statistic of Gelman and Rubin 159 indicates 
convergence to a stationary posterior distribution. This statistic compares 
the between- and within-variance of the different parallel chains. 

Numerous studies have demonstrated the usefulness of the SCEM- 
UA algorithm for estimating (nonlinear) parameter uncertainty. 
However, the method does not maintain detailed balance at every single 
step in the chain, casting doubt on whether the algorithm will 
appropriately sample the underlying pdf. Although various benchmark 
studies have reported very good sampling efficiencies and convergence 
properties of the SCEM-UA algorithm, violating detailed balance is a 
reason for at least some researchers and practitioners not to use this 
method for posterior inference. An adaptive MCMC algorithm that is 
efficient in hydrologic applications, and maintains detailed balance and 
ergodicity therefore remains desirable. 


6.2. DiffeRential Evolution Adaptive Metropolis (DREAM) 

Vrugt et al. 160 recently introduced the DiffeRential Evolution Adaptive 
Metropolis (DREAM) algorithm. This algorithm uses differential 
evolution as genetic algorithm for population evolution, with a 
Metropolis selection rule to decide whether candidate points should 
replace their respective parents or not. DREAM is a follow up to the DE- 
MC (Differential Evolution Markov Chain) method of ter Braak 161 , but 
contains several extensions to increase search efficiency and acceptance 
rate for complex and multimodal response surfaces with numerous local 
optimal solutions. Such surfaces are frequently encountered in 
hydrologic modeling. The method is presented below. 


1. Draw an initial population 



36 


K.K. Yilmaz et al 


FOR i <- 070 (CHAIN EVOLUTION) 

3. Generate a candidate point, S' in chain i. 


where 
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8. Compute the Gelman-Rubin ^ convergence diagnostic 159 for 

each dimension j □ UU<d lusing the last 50% of the samples in each 
chain. 

9. If fit < 1.2 for all j, stop, otherwise go to CHAIN EVOLUTION. 
At every step, the points in 
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6.3. Sequential Data Assimilation 

In the past few years, ensemble-forecasting techniques based on 
Sequential Data Assimilation (SDA) methods have become increasingly 
popular due to their potential ability to explicitly handle the various 
sources of uncertainty in environmental modeling. Techniques based on 
the Ensemble Kalman Filter 163 (EnKF) have been suggested as having 
the power and flexibility required for data assimilation using nonlinear 
models. In particular, Vrugt et al. 164 recently presented the Simultaneous 
Optimization and Data Assimilation (SODA) method, which uses the 
EnKF to recursively update model states while estimating time-invariant 
values for the model parameters using the SCEM-UA 155 optimization 
algorithm. A novel feature of SODA is its explicit treatment of errors due 
to parameter uncertainty, uncertainty in the initialization and propagation 
of state variables, model structural error and output measurement errors. 
The development below closely follows that of Vrugt et al. 164 

To help facilitate the description of the classical Kalman Filter (KF), 
we start by writing the model dynamics as a stochastic equation. In 
keeping with Figure 1, consider a model & in which the discrete time 
evolution of the state vector y/ is described with: 

Yi+i = QKYv x i> Q) + <li ( 14 ) 

where X represents the observed forcing (e.g. boundary conditions), 6 is 
the vector of parameter values, i denotes time and q t is a dynamical noise 
term representing errors in the conceptual model formulation. This 
stochastic forcing term flattens the probability density function of the 
states during the integration. We assume that the model predictions are 
related to its internal state according to: 

S t = H(y t )+e t e { ~N( 0,o?) (15) 

where H(-) is the measurement operator, which maps the state space into 
measurement or model output space, and also has a random additive 
error e u called the measurement error. Note that of denotes the error 
deviation of the measurements, and y/ represents the true model states. 
At each measurement time, when an output observation becomes 
available, the output forecast error z, is computed: 

z t = Si -H(yf) (16) 

and the forecasted states, y /{ , are updated using the standard KF analysis 
equation: 
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tf= V ,{+K i (s i -H( V ,{)) (17) 

where is the updated or analyzed state, and K t denotes the Kalman 
gain. The size of the gain directly depends on the size of the 
measurement and model error. 

The analyzed state then recursively feeds the next state propagation 
step into the model: 

rL = < (18) 

The virtue of the KF method is that it offers a very general framework 
for segregating and quantifying the effects of input, output, and model 
structural error in watershed modeling. Specifically, uncertainty in the 
model formulation and observational data are specified through the 
stochastic forcing terms q and s, whereas errors in the input data are 
quantified by stochastically perturbing the elements of X. 

The SODA method is an extension of traditional techniques in that it 
uses the Ensemble Kalman Filter (EnKF) to solve for equations (14) ± 
(18). The EnKF uses a Monte Carlo (MC) method to generate an 
ensemble of model trajectories from which the time evolution of the 
probability density of the model states and related error covariances are 
estimated. 165 The EnKF avoids many of the problems associated with the 
traditional extended Kalman Filter (EKF) method. For example, there is 
no closure problem as is introduced in the EKF by neglecting 
contributions from higher-order statistical moments in the error 
covariance evolution. Moreover, the conceptual simplicity, relative ease 
of implementation and computational efficiency of the EnKF make the 
method an attractive option for data assimilation in the meteorologic, 
oceanographic and hydrologic sciences. 

In summary, the EnKF propagates an ensemble of state vector 
trajectories in parallel, such that each trajectory represents one realization 
of generated model replicates. When an output measurement is available, 
each forecasted ensemble state vector y/{ is updated by means of a 
linear updating rule in a manner analogous to the Kalman Filter. A 
detailed description of the EnKF method, including the algorithmic 
details, is found in Evensen 163 and so will not be repeated here. 


7. Parallel Computing 
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The traditional implementation and application of the local and global 
optimization methods, discussed herein, involves sequential execution of 
the algorithm using the computational power of a single Central 
Processing Unit (CPU). Such an implementation works acceptably well 
for relatively simple optimization problems, and those optimization 
problems with models that do not require much computational time to 
execute. However, for high-dimensional optimization problems 
involving complex spatially distributed models, such as the ones 
frequently used in the field of environmental science, this sequential 
implementation needs to be revisited. Most computational time required 
for calibrating parameters in watershed models is spent running the 
model code and generating the desired output. Thus, there should be 
large computational efficiency gains from parallelizing the algorithm so 
that independent model simulations are run on different nodes in a 
distributed computer system. General-purpose parallel versions of SCE- 
UA, MOCOM, MOSCEM-UA, AMALGAM and DREAM have been 
developed in Octave and can be obtained from the second author upon 
request. 


8. Summary and Conclusions 

This chapter reviewed the current state-of-the-art of model calibration in 
watershed hydrology. The current status in watershed modeling is that, 
regardless of their type (i.e. conceptual/physically-based) and structural 
complexity (e.g. spatially lumped/distributed), watershed models still 
need to be calibrated to improve the reliability of their 
simulation/prediction performance. In calibration, model parameters are 
systematically adjusted in such a way that the behavior of the model 
approximates, as closely and consistently as possible, the historical 
observed response of the watershed system under study. In parallel with 
advances in computing resources and measurement techniques, 
hydrologic models are becoming increasingly complex, not only 
representing the spatial heterogeneity of a watershed but also simulating 
various internal states and fluxes. This increase in complexity is typically 
accompanied by a decrease in identifiability (or justification) of the 
components and parameters in these models due to problems with direct 
observation and noisy data. This leads to highly uncertain model 
predictions. There is, therefore, a pressing need for better approaches to 
model calibration. 



Model Calibration in Watershed Hydrology 


41 


We discussed manual and automatic parameter estimation techniques 
for calibration of lumped and spatially distributed watershed models. 
Historically, manual calibration has been the common practice for model 
calibration. Its main strength is the use of expert knowledge to match the 
perceived hydrologic processes in the watershed with their equivalents in 
the model, and thus they help to prescribe consistent values for the 
parameters. However, manual calibration is time and labor intensive, 
involving many subjective decisions. The era of digital computing and 
advances in hardware power and speed has led to the development of 
automated procedures that are more objective to implement and use, and 
more efficient in systematically searching IRUCI WKHI 1 3 \faHMWof Ithe 
model parameters. We have discussed the historical development of 
single criterion methods and provided details of the widely used SCE- 
UA global optimization algorithm. The major limitation of automated 
algorithms is that they rely on a single regression-based aggregate 
measure of model performance, which often leads to an ill-posed 
parameter estimation problem. This is due to a loss of information when 
projecting from the high-dimensional data space down to the single 
dimension of the residual-based summary statistic. For highly complex 
models with many interacting parameters, this approach is often poor at 
discriminating between the effects of individual parameters on the 
simulated model output. This, therefore, unnecessarily introduces non- 
uniqueness and equifmality in parameter estimation. 

More sophisticated automated (or semi-automated) methods for 
parameter adjustment have been developed within the framework of 
multi-criteria theory, thereby more closely matching the number of 
unknowns (the parameters) with the number of pieces of information (the 
measures). A major advantage of the multi-criteria approach is that 
various aspects of the manual calibration strategy can be absorbed into 
the calibration process, thus strengthening the physical basis of the 
identified parameters. Although multi-criteria methods are widely used 
in the calibration of watershed models and powerful algorithms have 
been developed (among these we discussed MOCOM-UA, MACS, 
DYNIA, PIMLI, AMALGAM and MOSCEM) in the past decade or so, 
how to properly incorporate expert knowledge into automatic parameter 
estimation still remains difficult, and an active area of research. To better 
realize the strengths of multi-criteria optimization requires a shift from 
statistical regression-based performance measures towards more 
powerful and hydrologically relevant (contextual) measures of 
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information. 77,80 One possible way to develop these signature-based 

PHDVXUHV □ LVD WKHI GHWHF WLRQ □ RID FKDUDFWHULVWLF □ RUD 3 VLJQDWXUH' C 
spatio-temporal data and by relating these to their causal mechanisms. 

Signature measures that quantitatively summarize these signature 
patterns can be used to estimate plausible model parameters and further 
point towards possible causes of model failure and guide model 
improvement strategies. More research is warranted on how to 1) 
properly construct different and complementary signature measures of 
model performance that are diagnostic, 77,80 while being representative of 
the watershed response behaviors that are deemed important to reproduce 
with the model; and 2) integrate these signature measures into automated 
calibration techniques while keeping the resulting multi-criteria 
optimization problem computationally tractable. 

We have also discussed the calibration of spatially distributed 
watershed models. Calibration of such models is complicated by then- 
large number of parameters. Part of the solution to the stabilization of ill- 
posed problems of this kind is to recognize that the spatially distributed 
elements of the model parameter fields are not, in fact, independent 
entities that can take on arbitrary values in the parameter space. Instead, 
their values are somehow related to the spatial distributions of various 
hydrologically relevant watershed characteristics including, for example, 
geology, soil type, vegetation, topography, etc. Recognition of these 
dependencies can facilitate implementation of regularization 
relationships that help to constrain the dimensionality of the parameter 
estimation problem. We would like to emphasize that although 
application of regularization techniques in calibration of complex 
watershed models is in its infancy (e.g. Pokhrel et al. 125 ), considerable 
progress has already been reported in adjacent fields, such as 
groundwater and subsurface hydrology. Methods and approaches 
developed therein might be of great use and inspiration to solve 
parameter estimation problems in watershed hydrology. Particularly 
appealing are approaches that attempt to 1) improve conditioning of the 
optimization problem through use of additional information (penalty 
functions; e.g. Tikonov and Arsenin 127 ); 2) reduce the dimensionality of 
the optimization problem using concepts of sensitivity and principle 
component analysis (e.g. Tonkin and Doherty 139 ), as discussed briefly 
herein; and 3) representer-based calibration methods. 166 Further, the 
availability of parallel computing and powerful optimization algorithms 
that can efficiently handle a high number of parameters in complex 
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distributed hydrologic models is of particular importance for developing 
rigorous model calibration strategies for such models. 75 

Finally, this chapter has discussed emerging methods for parameter 
uncertainty estimation. Uncertainty quantification is currently receiving a 
surge in attention in hydrology, as researchers try to better understand 
what is well and what is not well understood about the watersheds that 
are being studied and as decision makers push to better quantify accuracy 
and precision of model predictions. Various methodologies have been 
developed in the past decade to better treat uncertainty. We have 
highlighted the standard RWM for posterior exploration, and have 
discussed recent advances in MCMC sampling by combining multi-chain 
population evolution with the genetic algorithm Differential Evolution 
and subspace sampling. We have also summarized emerging state-space 
filtering methods that rely on the Ensemble Kalman Filter to specifically 
treat forcing (input), output, parameter and state error, within a recursive 
estimation framework. 

Among emerging approaches for uncertainty estimation not discussed 
in this chapter include methods such as Bayesian Estimation of Structure 
(BESt), 167 Bayesian Model Averaging (BMA), 168,169,170,171 and its 
maximum likelihood (ML) variant, MLBMA 172,173,174 . Such methods 
seem particularly appealing because they help to account for model 
structural uncertainty. BMA and MLBMA, in particular, are relatively 
easy to implement, accounting jointly for conceptual model and 
parameter uncertainty by considering a set of alternative models that may 
be very different from each other while being based on a common set of 
data. 
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SUMMARY 

Hydrologic models use relatively simple mathematical equations to conceptualize and aggregate the 
complex, spatially distributed, and highly interrelated water, energy, and vegetation processes in a 
watershed. A consequence of process aggregation is that the model parameters often do not represent 
directly measurable entities and must, therefore, be estimated using measurements of the system inputs and 
outputs. During this process, known as model calibration, the parameters are adjusted so that the behavior 
of the model approximates, as closely and consistently as possible, the observed response of the hydrologic 
system over some historical period of time. This Chapter reviews the current state-of-the-art of model 
calibration in watershed hydrology with special emphasis on our own contributions in the last few decades. 
We discuss the historical background that has led to current perspectives, and review different approaches 
for manual and automatic single- and multi-objective parameter estimation. In particular, we highlight the 
recent developments in the calibration of distributed hydrologic models using parameter dimensionality 
reduction sampling, parameter regularization and parallel computing. Finally, this chapter concludes with a 
short summary of methods for assessment of parameter uncertainty, including recent advances in Markov 
chain Monte Carlo sampling and sequential data assimilation methods based on the Ensemble Kalman 
Filter. 



